In [1]:
from notebook.services.config import ConfigManager
from IPython.paths import locate_profile
cm = ConfigManager(profile_dir=locate_profile(get_ipython().profile))
cm.update('livereveal', {
              'theme': 'solarized',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
})


Out[1]:
{'start_slideshow_at': 'selected', 'theme': 'solarized', 'transition': 'zoom'}

Recursion and Dictionaries

Dr. Chris Gwilliams

gwilliamsc@cardiff.ac.uk

Overview

  • Scripts in Python
  • Types
  • Methods and Functions
  • Flow control
  • Lists
  • Iteration

    • for loops
    • while loops ## Now
  • Dicts

  • Tuples
  • Iteration vs Recursion
  • Recursion

Dictionaries

Python has many different data structures available (see here)

The dictionary structure is similar to a list, but the index is specified by you.

It is also known as an associative array, where values are mapped to a key.


In [1]:
empty_dict = {}
contact_dict = {
    "name": "Homer",
    "email": "homer@simpsons.com",
    "phone": 999
}
print(contact_dict)


{'phone': 999, 'name': 'Homer', 'email': 'homer@simpsons.com'}

This follows the format of JSON (JavaScript Object Notation).

Keys can be accessed the same way as lists:


In [ ]:
print(contact_dict['email'])

Exercise

  • Create a dictionary with information about the software academy
  • Loop through it and print the values
  • Now use enumerate to print the key index
  • Modify the loop to only print the values where the length is > 5

In [11]:
soft_acc = {
    "post_code": "np20",
    "floor": 3
}
for key in soft_acc:
    print(key)
for key in soft_acc:
    if len(key) > 5:
        print(key)


post_code
floor
post_code

Keys and Values

Dictionaries have methods associated with them to access the keys, values or both from within the dict.

Exercise

Use the dir function (and the Python docs) on your soft_accc dict and write down the 3 methods that can be used to access the keys, values and both


In [17]:
print(soft_acc.keys())
print(soft_acc.values())
print(soft_acc.items())


dict_keys(['post_code', 'floor'])
dict_values(['np20', 3])
dict_items([('post_code', 'np20'), ('floor', 3)])

Exercise

Using the methods you found, write a function that has a dictionary as an argument and loops through the values to return the first value that is of type int

Create a new functiom that does the same but returns the key of that value.


In [14]:
def find_first_int_value(dictionary):
    for val in dictionary.values():
        if type(val) is int:
            return val
        
def find_first_int_key(dictionary):
    for key, val in dictionary.items():
        if type(key) is int:
            return key        
        
def example(dictionary):
    for key, val in enumerate(dictionary):
        if type(key) is int:
            return key        
        
find_first_int_value(soft_acc)
find_first_int_key(soft_acc)
example(dictionary)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-863bec1c83f6> in <module>()
     16 find_first_int_value(soft_acc)
     17 find_first_int_key(soft_acc)
---> 18 example(dictionary)

NameError: name 'dictionary' is not defined

Accessing and Reassigning

With dicts, we can access keys through square bracket notation:

my_dict['key']

or through the get method:

mydict.get('key')

Removing Items

Much like lists, we can pop elements from a dict, but the way this is done is slightly different:

pop() - One must provide the key of the item to be removed and the value is returned. An error is given if nothing was found

popitem() - This works much like pop on a list, removing the last item in the dict and providing the key and the value.

Exercise

  • Create a dict of student numbers as keys and student names as values.
  • Print the third value in the dict using the get method
  • Choose any item in the list and pop it off and save it to a variable
  • Now add it back into the dict
  • Using the docs, explain the difference between pop and popitem return types
  • Using the docs, call the pop method for a key that does not exist, but make it return a string that reads, "Sorry, that student number does not exist"

In [16]:
students = {1234: "gary", 4567: "jen"}
print(students.get(1234))
gary = students.popitem() #how does this differ from pop?
print(gary)
print(students)
#pop gives a value but popitem gives you a tuple
students[gary[0]] = gary[1]
print(students)
print(students.pop(48789492, "Sorry, that student number does not exist"))


gary
(1234, 'gary')
{4567: 'jen'}
{1234: 'gary', 4567: 'jen'}
Sorry, that student number does not exist

Tuples

We have just seen that tuples are a data structure in Python and that they are not as simple as lists or ints!

Tuples are like lists but: they are immutable!

We can access them like lists as well:


In [35]:
my_tuple = 1, 2
print(my_tuple)
new_tuple = 3, "a", 6
print(new_tuple)
print(new_tuple[1])


(1, 2)
(3, 'a', 6)
a

Unpacking Tuples

Tuples can hold any number of values and it is easy enough to access them using square bracket notation.

But, we may receive a tuple that contains many values and this is no fun:


In [39]:
my_tuple = 999, "Dave", "dave@dave.com", True, True
phone = my_tuple[0]
name = my_tuple[1]
email = my_tuple[2]
yawn = my_tuple[3]
still_not_done = my_tuple[4]

In [ ]:
#unpacking tuples: number of names on left MUST match values in tuple
phone, name, email, yawn, still_not_done = my_tuple

Searching Dictionaries

Unlike lists, we are not just looking for values within dictionaries. Since we can specify our own keys, there are many times we may want to search this as well.

Exercise

Using the student dictionary from a few slides back, find the index of the student with the name "jen" using only the name and then the student number and then a tuple of the two.

Hint: Use the methods attached to dicts.


In [20]:
students = {1234: "gary", 4567: "jen"}
print("1. {}".format(1234 in students))
print(1234 in students.keys())
print("jen" in students)
print("gary" in students.values())
print("gary" in students.items())
print((1234, "gary") in students.items())


1. True
True
False
True
False
True

Exercise

Write a script that:

  1. Users can input messages until they type 'q'
  2. Messages are added to a dictionary with the length of the message as the key
  3. Write a function that uses the keys as the input to return the average length
  4. Write a second function that takes the values of the dictionary and sorts them according to length

In [7]:
strings = ['hey', 'a', 'you there', 'what to the what to the what', 'sup', 'oi oi', 'how are you doing, good sir?']

# strings_dict = {}
strings_dict = {33: 'dfhjkdshfjkhdsfjkhdskahfksahkjdfk', 18: 'dkjhskjhskdhffkdjh', 19: 'dfkjsdhfkjdhsfkjhdk', 5: 'kdkdf', 6: 'fdjhfd', 9: 'fkljrwlgj', 28: 'fdjfkdjfkljlskdfjlksdjflsk;a'}
# while True:
#     msg = input("Please type your message here:")
#     if msg is not 'q':
#         strings_dict[len(msg)] = msg
#     else:
#         break

def list_to_dict(strings):
    strings_dict = {}
    for string in strings:
        strings_dict[len(string)] = string
    return strings_dict


def sort_list(input_list):
    is_sorted = True
    for key in range(0, len(input_list)):
        for i in range(0, len(input_list)):
            current = input_list[i]
            if i + 1 < len(input_list):
                if len(current) > len(input_list[i + 1]):
                    input_list[i] = input_list[i + 1]
                    input_list[i + 1] = current
    return input_list
                
        
def mean_length(lengths):
    total_length = 0
    for length in lengths:
        total_length += length
    return total_length/len(lengths)

# strings_dict = list_to_dict(strings)
print("Average string length is {0}".format(mean_length(strings_dict.keys())))
sorted_list = sort_list(list(strings_dict.values()))
print("Sorted list is {0}".format(sorted_list))


Average string length is 16
Sorted list is ['kdkdf', 'fdjhfd', 'fkljrwlgj', 'dkjhskjhskdhffkdjh', 'dfkjsdhfkjdhsfkjhdk', 'fdjfkdjfkljlskdfjlksdjflsk;a', 'dfhjkdshfjkhdsfjkhdskahfksahkjdfk']

Recursion

An iterative function is a function that loops to repeat a block of code.

A recursive function is a function that calls itself until a condition is met.


In [8]:
def sum_until(x):
    if x == 1:
        return x
    else:
        return x + sum_until(x - 1)

print(sum_until(3))


6

What is Happening Here?

Python sees our call to the function and executes it.

Recursion Level

1

Does x equal 1? No Return x + sum_until(2)

2

Does x equal 1? No Return x + (2 + sum_until(1))

3

Does x equal 1? Yes Return x (6)

Exercise

Write a function that takes a list as an input, a start index and checks if the value at that index is greater than the value at the next index. If it is more: swap them. Return the list.

HINT: You must make sure that the index + 1 must be less than the length of the list.


In [29]:
def check_value(input_list, start):
    if(start == len(input_list) - 1):
        return
    elif(input_list[start] > input_list[start+1]):
        current = input_list[start]
        input_list[start] = input_list[start + 1]
        input_list[start + 1] = current
    return input_list

l = [3,1,4]
print(check_value(l, 0))


[1, 3, 4]

Exercise

Now modify the function to use an end index as an argument (which is the length of the list to begin with).

In your check for whether the start index is more than the length of the list, do the following things:

  • call the function again, with the same list as the arguments
    • the start index set to 0
    • the end index decremented
  • Before returning the original list, call the function again but increment the start index
  • Add a check to return the list at the start of the function, if the start index is more than the end

In [30]:
#function receives list, start point and endpoint as args
def recursive_sort(input_list, index, end):
    #if the startpoint goes beyond the endpoint then return
    if index > end:
        return(input_list)
    #if the start point is equal to the end then decrement the end
    if index == end:
        recursive_sort(input_list, 0, end - 1)
    # check if the string at index is longer than the string at index + 1
    # replace it if it is
    # why do we need a temporary variable?
    elif len(input_list[index]) > len(input_list[index + 1]):
        current = input_list[index]
        print("Switching \"{0}\" at {1} for \"{2}\"".format(current, index, input_list[index + 1]))
        input_list[index] = input_list[index + 1]
        input_list[index + 1] = current
    # call the  function again and increment the index
    recursive_sort(input_list, index + 1, end)
    # Why do we need this here?
    return input_list
    
strings = ['hey', 'a', 'you there', 'what to the what to the what', 'sup', 'oi oi', 'how are you doing, good sir?']
sorted_list = recursive_sort(strings, 0, len(strings)-1)
print(sorted_list)


Switching "hey" at 0 for "a"
Switching "what to the what to the what" at 3 for "sup"
Switching "what to the what to the what" at 4 for "oi oi"
Switching "you there" at 2 for "sup"
Switching "you there" at 3 for "oi oi"
['a', 'hey', 'sup', 'oi oi', 'you there', 'what to the what to the what', 'how are you doing, good sir?']

In [10]:
#uncommented
def recursive_sort(input_list, index, end):
    if index > end:
        return(input_list)
    if index == end:
        recursive_sort(input_list, 0, end - 1)
    elif len(input_list[index]) > len(input_list[index + 1]):
        current = input_list[index]
        print("Switching \"{0}\" at {1} for \"{2}\"".format(current, index, input_list[index + 1]))
        input_list[index] = input_list[index + 1]
        input_list[index + 1] = current
    recursive_sort(input_list, index + 1, end)
    return input_list
    
strings = ['hey', 'a', 'you there', 'what to the what to the what', 'sup', 'oi oi', 'how are you doing, good sir?']
sorted_list = recursive_sort(strings, 0, len(strings)-1)
print(sorted_list)


Switching "hey" at 0 for "a"
Switching "what to the what to the what" at 3 for "sup"
Switching "what to the what to the what" at 4 for "oi oi"
Switching "you there" at 2 for "sup"
Switching "you there" at 3 for "oi oi"
['a', 'hey', 'sup', 'oi oi', 'you there', 'what to the what to the what', 'how are you doing, good sir?']

Bubble Sort

Congratulations, you just implemented your first sorting algorithm. You can find more information on the bubble sort here

Recursion vs. Iteration

You have seen both, which is better/faster/more optimal?

While recursive approaches are typically shorter and easier to read. However, it also results in slower code because of all the funtion calls it results in, as well as the risk of a stack overflow when too many calls are made.

Typically, math-based apparoaches will use recursion and most software engineering code will use iteration. Basically, most algorithms will use recursion so you need to understand how it works.

When Should You Use It?

Recursion is often seen as some mythical beast but the breakdown (as we have seen) is quite simple.

However, most (not all) languages are not tuned for recursion and, in performance terms, iteration is often vastly quicker.


In [3]:
import timeit

def recursive_factorial(n):
    if n == 1:
        return 1
    else:
        return n * recursive_factorial(n-1)

def iterative_factorial(n):
    x = 1
    for each in range(1, n + 1):
        x = x * each
    return x
    
print("Timing runs for recursive approach: ")
%timeit for x in range(100): recursive_factorial(500)
print("Timing runs for iterative approach: ")
%timeit for x in range(100): iterative_factorial(500)
# print(timeit.repeat("factorial(10)",number=100000))


Timing runs for recursive approach: 
100 loops, best of 3: 15.3 ms per loop
Timing runs for iterative approach: 
100 loops, best of 3: 7.91 ms per loop

Why the Difference?

To understand this, we need to understand a little bit about how programs are run.

Two key things are the stack and the heap

Stack

Every time a function or a method is called, it is put on the stack to be executed. Recursion uses the stack extensively because each function calls itself (until some condition is met).

See the code below:

def recursive():
    return recursive()

Running this will result in the recursive function being called an infinite number of times. The Python interpreter cannot handle this, so it will shut itself down and cause a stack overflow

Heap

The heap is the space for dynamic allocation of objects. The more objects created, the greater the heap. Although, this is dynamic and can grow as the application grows.

Python also takes care of this for us, by using a garbage collector. This tracks allocations of objects and cleans them up when they are no longer used. We can force things to be cleared by using: del my_var

However, if assigning that variable takes up 50MB, Python may not always clear 50MB when it is deallocated. Why do you think this is?

FizzBuzz Exercise

Write a for loop that goes from 1 to 100 (inclusive) and prints:

  • fizz if the number is a multiple of 3
  • buzz if the number is a multiple of 5
  • fizzbuzz if the number is a multiple of both 3 and 5
  • the value for any other case

Exercise

Now turn this into a function and modify it to not use a for loop and use recursion. I.e. calling the function until the value reaches 100.

Homework

Insertion Sort

The insertion sort is a basic algorithm to build the sorted array in a similar way to the bubble sort.

The list is sorted by looping through all the elements from the index to the end, moving the index along for each loop.


In [ ]:
import random

def sort_numbers(s):
    for i in range(1, len(s)):
        val = s[i]
        j = i - 1
        while (j >= 0) and (s[j] > val):
            s[j+1] = s[j]
            j = j - 1
        s[j+1] = val


# x = eval(input("Enter numbers to be sorted: "))
# x = list(range(0, 10)) #list(x)
x = random.sample(range(1, 1000000001), 100000000)
print(x)
sort_numbers(x)
print(x)